Error Analysis of Band Matrix Method
Numerical error in the solution of the band matrix method based on the elimination method in single precision is investigated theoretically and experimentally, and the behaviour of the truncation error and the roundoff error is clarified. Some important suggestions for the useful application of the band solver are proposed using the results of the above error analysis.
Control as Probabilistic Inference as an Emergent Communication Mechanism in Multi-Agent Reinforcement Learning
This paper proposes a generative probabilistic model integrating emergent
communication and multi-agent reinforcement learning. The agents plan their
actions by probabilistic inference, called control as inference, and
communicate using messages that are latent variables and estimated based on the
planned actions. Through these messages, each agent can send information about
its actions and know information about the actions of another agent. Therefore,
the agents change their actions according to the estimated messages to achieve
cooperative tasks. This inference of messages can be considered as
communication, and this procedure can be formulated by the Metropolis-Hastings
naming game. Through experiments in the grid world environment, we show that
the proposed PGM can infer meaningful messages to achieve the cooperative task.
Integration of Imitation Learning using GAIL and Reinforcement Learning using Task-achievement Rewards via Probabilistic Graphical Model
Integration of reinforcement learning and imitation learning is an important
problem that has been studied for a long time in the field of intelligent
robotics. Reinforcement learning optimizes policies to maximize the cumulative
reward, whereas imitation learning attempts to extract general knowledge about
the trajectories demonstrated by experts, i.e., demonstrators. Because each
approach has its own drawbacks, methods that combine them so that each
compensates for the other's drawbacks have been explored. However, many of these methods are
heuristic and do not have a solid theoretical basis. In this paper, we present
a new theory for integrating reinforcement and imitation learning by extending
the probabilistic generative model framework for reinforcement learning, plan
by inference. We develop a new probabilistic graphical model for
reinforcement learning with multiple types of rewards and a probabilistic
graphical model for Markov decision processes with multiple optimality
emissions (pMDP-MO). Furthermore, we demonstrate that the integrated learning
method of reinforcement learning and imitation learning can be formulated as a
probabilistic inference of policies on pMDP-MO by considering the output of the
discriminator in generative adversarial imitation learning as an additional
optimal emission observation. We adapt the generative adversarial imitation
learning and task-achievement reward to our proposed framework, achieving
significantly better performance than agents trained with reinforcement
learning or imitation learning alone. Experiments demonstrate that our
framework successfully integrates imitation and reinforcement learning even
when only a few demonstrators are available.
Comment: Submitted to Advanced Robotic
Symbol emergence as interpersonal cross-situational learning: the emergence of lexical knowledge with combinatoriality
We present a computational model for a symbol emergence system that enables
the emergence of lexical knowledge with combinatoriality among agents through a
Metropolis-Hastings naming game and cross-situational learning. Many
computational models have been proposed to investigate combinatoriality in
emergent communication and symbol emergence in cognitive and developmental
robotics. However, existing models do not sufficiently address category
formation based on sensory-motor information and semiotic communication through
the exchange of word sequences within a single integrated model. Our proposed
model facilitates the emergence of lexical knowledge with combinatoriality by
performing category formation using multimodal sensory-motor information and
enabling semiotic communication through the exchange of word sequences among
agents in a unified model. Furthermore, the model enables an agent to predict
sensory-motor information for unobserved situations by combining words
associated with categories in each modality. We conducted two experiments with
two humanoid robots in a simulated environment to evaluate our proposed model.
The results demonstrated that the agents can acquire lexical knowledge with
combinatoriality through interpersonal cross-situational learning based on the
Metropolis-Hastings naming game and cross-situational learning. Furthermore,
our results indicate that the lexical knowledge developed using our proposed
model exhibits generalization performance for novel situations through
interpersonal cross-modal inference.